Structured Training for Large-Vocabulary Chord Recognition
Authors
Abstract
Automatic chord recognition systems operating in the large-vocabulary regime must overcome data scarcity: certain classes occur much less frequently than others, which makes it difficult to estimate model parameters for them. While most systems model the chord recognition task as a (multi-class) classification problem, few attempts have been made to directly exploit the intrinsic structural similarities between chord classes. In this work, we develop a deep convolutional-recurrent model for automatic chord recognition over a vocabulary of 170 classes. To exploit structural relationships between chord classes, the model is trained to produce both the time-varying chord label sequence and binary encodings of chord roots and qualities. This binary encoding directly exposes similarities between related classes, allowing the model to learn a more coherent representation of simultaneous pitch content. Evaluations on a corpus of 1217 annotated recordings demonstrate substantial improvements compared to previous models.
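To illustrate how a binary encoding can expose similarities between chord classes, the following is a minimal sketch, not the authors' implementation. It assumes a 12-dimensional one-hot root vector and a 12-dimensional binary pitch-class vector per chord; the `QUALITIES` table, the function names, and the choice of absolute pitch classes for the quality encoding are all hypothetical and do not come from the abstract.

```python
# Hypothetical sketch: structured binary targets for chord labels.
# The quality table and vector layout are assumptions for illustration only.

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Assumed quality definitions: semitone intervals above the root.
QUALITIES = {
    "maj": (0, 4, 7),
    "min": (0, 3, 7),
    "dim": (0, 3, 6),
    "aug": (0, 4, 8),
    "maj7": (0, 4, 7, 11),
    "min7": (0, 3, 7, 10),
    "7": (0, 4, 7, 10),
}


def encode_chord(root, quality):
    """Return (root_vec, pitch_vec), each a list of 12 binary values."""
    root_idx = PITCH_CLASSES.index(root)
    # One-hot encoding of the root pitch class.
    root_vec = [1 if i == root_idx else 0 for i in range(12)]
    # Binary vector marking every pitch class sounded by the chord.
    active = {(root_idx + interval) % 12 for interval in QUALITIES[quality]}
    pitch_vec = [1 if i in active else 0 for i in range(12)]
    return root_vec, pitch_vec


def shared_pitches(vec_a, vec_b):
    """Count pitch classes that are active in both binary vectors."""
    return sum(a & b for a, b in zip(vec_a, vec_b))


if __name__ == "__main__":
    _, c_maj = encode_chord("C", "maj")
    _, c_maj7 = encode_chord("C", "maj7")
    _, fs_maj = encode_chord("F#", "maj")
    print("C:maj vs C:maj7 shared pitch classes:", shared_pitches(c_maj, c_maj7))
    print("C:maj vs F#:maj shared pitch classes:", shared_pitches(c_maj, fs_maj))
```

Under a plain multi-class target, C:maj and C:maj7 are simply two different labels. Under this assumed encoding they overlap in most of their active bits, so a rare class can share training signal with a frequent related one.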
Similar resources
Large Vocabulary Automatic Chord Estimation with an Even Chance Training Scheme
This paper presents a large vocabulary automatic chord estimation system implemented using a bidirectional long short-term memory recurrent neural network trained with a skewed-class-aware scheme. This scheme gives the uncommon chord types much more exposure during the training process. The evaluation results indicate that: compared with a normal training scheme, the proposed scheme can boost t...
A Hybrid Gaussian-HMM-Deep Learning Approach for Automatic Chord Estimation with Very Large Vocabulary
We propose a hybrid Gaussian-HMM-Deep-Learning approach for automatic chord estimation with very large chord vocabulary. The Gaussian-HMM part is similar to Chordino, which is used as a segmentation engine to divide input audio into note spectrogram segments. Two types of deep learning models are proposed to classify these segments into chord labels, which are then connected as chord sequences....
Structured Support Vector Machines for Speech Recognition
Discriminative training criteria and discriminative models are two effective improvements for HMM-based speech recognition. This thesis proposed a structured support vector machine (SSVM) framework suitable for medium to large vocabulary continuous speech recognition. An important aspect of structured SVMs is the form of features. Several previously proposed features in the field are summarized in ...
Mirex 2013: Large Vocabulary Chord Recognition System Using Multi-band Features and a Multi-stream Hmm
This paper describes the submitted systems to the MIREX 2013: Audio Chord Estimation task.
Roles of Pre-Training and Fine-Tuning in Context-Dependent DBN-HMMs for Real-World Speech Recognition
Recently, deep learning techniques have been successfully applied to automatic speech recognition tasks: first to phonetic recognition with context-independent deep belief network (DBN) hidden Markov models (HMMs), and later to large vocabulary continuous speech recognition using context-dependent (CD) DBN-HMMs. In this paper, we report our most recent experiments designed to understand the role...